
    Towards Universal Semantic Tagging

    The paper proposes the task of universal semantic tagging: tagging word tokens with language-neutral, semantically informative tags. We argue that the task, with its language-independent nature, contributes to better semantic analysis for wide-coverage multilingual text. We present the initial version of the semantic tagset and show that (a) the tags provide semantically fine-grained information, and (b) they are suitable for cross-lingual semantic parsing. An application of the semantic tagging in the Parallel Meaning Bank supports both of these points, as the tags contribute to formal lexical semantics and their cross-lingual projection. As part of the application, we annotate a small corpus with the semantic tags and present a new baseline result for universal semantic tagging. Comment: 9 pages, International Conference on Computational Semantics (IWCS)
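
    To make the task concrete, the following is a minimal sketch of a token-level semantic tagging interface. The tag names and the lexicon-lookup tagger are illustrative placeholders, not the paper's actual tagset or model.

```python
# Minimal sketch of token-level universal semantic tagging.
# Tag names ("PRO", "CON", "NOW") are illustrative, not the official tagset.

from typing import List, Tuple

def sem_tag(tokens: List[str], lexicon: dict) -> List[Tuple[str, str]]:
    """Assign a language-neutral semantic tag to every token,
    falling back to a generic 'UNK' tag for unknown words."""
    return [(tok, lexicon.get(tok.lower(), "UNK")) for tok in tokens]

if __name__ == "__main__":
    # Toy lexicon: in practice the tags would come from a trained tagger.
    toy_lexicon = {"she": "PRO", "sings": "CON", "now": "NOW"}
    print(sem_tag(["She", "sings", "now"], toy_lexicon))
    # [('She', 'PRO'), ('sings', 'CON'), ('now', 'NOW')]
```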

    Neural Semantic Parsing by Character-based Translation: Experiments with Abstract Meaning Representations

    We evaluate the character-level translation method for neural semantic parsing on a large corpus of sentences annotated with Abstract Meaning Representations (AMRs). Using a sequence-to-sequence model, and some trivial preprocessing and postprocessing of AMRs, we obtain a baseline accuracy of 53.1 (F-score on AMR triples). We examine five different approaches to improve this baseline result: (i) reordering AMR branches to match the word order of the input sentence increases performance to 58.3; (ii) adding part-of-speech tags (automatically produced) to the input also shows improvement (57.2); (iii) introducing super characters (conflating frequent character sequences into a single character) helps as well, reaching 57.4; (iv) optimizing the training process by using pre-training and averaging a set of models increases performance to 58.7; (v) adding silver-standard training data obtained by an off-the-shelf parser yields the biggest improvement, resulting in an F-score of 64.0. Combining all five techniques leads to an F-score of 71.0 on holdout data, which is state-of-the-art in AMR parsing. This is remarkable because of the relative simplicity of the approach. Comment: Camera-ready for CLIN 2017 journal
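
    The super-character idea in step (iii) can be sketched as a simple preprocessing pass: frequent character n-grams are conflated into single placeholder symbols before character-level translation. The function names and the use of Unicode private-use codepoints below are assumptions for illustration, not the authors' implementation.

```python
# Hedged sketch of "super character" preprocessing for character-level models.

from collections import Counter
from typing import Dict, List

def build_super_chars(corpus: List[str], n: int = 3, top_k: int = 50) -> Dict[str, str]:
    """Map the top_k most frequent character n-grams to fresh single symbols."""
    counts = Counter()
    for line in corpus:
        for i in range(len(line) - n + 1):
            counts[line[i:i + n]] += 1
    # Use private-use-area codepoints as the new single characters.
    return {ngram: chr(0xE000 + idx)
            for idx, (ngram, _) in enumerate(counts.most_common(top_k))}

def apply_super_chars(text: str, table: Dict[str, str]) -> str:
    """Greedily replace frequent n-grams with their super characters."""
    for ngram, symbol in table.items():
        text = text.replace(ngram, symbol)
    return text

if __name__ == "__main__":
    corpus = ["the boy wants to go", "the girl wants to sing"]
    table = build_super_chars(corpus, n=3, top_k=5)
    print(apply_super_chars("the boy wants to sing", table))
```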

    The Meaning Factory at SemEval-2017 Task 9: Producing AMRs with Neural Semantic Parsing

    We evaluate a semantic parser based on a character-based sequence-to-sequence model in the context of the SemEval-2017 shared task on semantic parsing for AMRs. With data augmentation, super characters, and POS tagging we gain major improvements in performance compared to a baseline character-level model. Although we improve on previous character-based neural semantic parsing models, the overall accuracy is still lower than that of a state-of-the-art AMR parser. An ensemble combining our neural semantic parser with an existing, traditional parser yields a small gain in performance. Comment: To appear in Proceedings of SemEval, 2017 (camera-ready)
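
    One simple way such an ensemble over two (or more) parsers could work is to pick, per sentence, the candidate AMR most similar to the other candidates. This is given purely as an illustration; the abstract does not specify the exact combination scheme. A pairwise similarity function smatch(a, b) returning a score in [0, 1] is assumed to be available from an external Smatch implementation.

```python
# Illustrative ensemble-by-consensus over candidate AMRs (not the paper's method).

from typing import Callable, List

def ensemble_pick(candidates: List[str],
                  smatch: Callable[[str, str], float]) -> str:
    """Return the candidate AMR most similar on average to all others."""
    def avg_support(c: str) -> float:
        others = [o for o in candidates if o is not c]
        return sum(smatch(c, o) for o in others) / max(len(others), 1)
    return max(candidates, key=avg_support)
```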

    Semantic Tagging with Deep Residual Networks

    We propose a novel semantic tagging task, sem-tagging, tailored for the purpose of multilingual semantic parsing, and present the first tagger using deep residual networks (ResNets). Our tagger uses both word and character representations and includes a novel residual bypass architecture. We evaluate the tagset both intrinsically, on the new task of semantic tagging, and on Part-of-Speech (POS) tagging. Our system, consisting of a ResNet and an auxiliary loss function predicting our semantic tags, significantly outperforms prior results on English Universal Dependencies POS tagging (95.71% accuracy on UD v1.2 and 95.67% accuracy on UD v1.3). Comment: COLING 2016, camera-ready version
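
    A minimal PyTorch-style sketch of the kind of architecture described above: word and character representations, residual ("bypass") blocks, and an auxiliary loss that predicts semantic tags alongside POS tags. All layer sizes and the exact wiring are assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.ff(x)          # residual bypass around the block

class MultiTaskTagger(nn.Module):
    def __init__(self, vocab, char_vocab, n_pos, n_sem, dim=128):
        super().__init__()
        self.word_emb = nn.Embedding(vocab, dim)
        self.char_emb = nn.Embedding(char_vocab, dim)
        self.blocks = nn.Sequential(ResidualBlock(2 * dim), ResidualBlock(2 * dim))
        self.pos_head = nn.Linear(2 * dim, n_pos)   # main task
        self.sem_head = nn.Linear(2 * dim, n_sem)   # auxiliary task

    def forward(self, words, chars):
        # words: (batch, seq); chars: (batch, seq, max_word_len)
        w = self.word_emb(words)
        c = self.char_emb(chars).mean(dim=2)        # crude character pooling
        h = self.blocks(torch.cat([w, c], dim=-1))
        return self.pos_head(h), self.sem_head(h)

def joint_loss(pos_logits, sem_logits, pos_gold, sem_gold, aux_weight=0.5):
    """Main POS loss plus a weighted auxiliary sem-tagging loss."""
    ce = nn.CrossEntropyLoss()
    main = ce(pos_logits.flatten(0, 1), pos_gold.flatten())
    aux = ce(sem_logits.flatten(0, 1), sem_gold.flatten())
    return main + aux_weight * aux
```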

    Predicate logic unplugged

    In this paper we describe the syntax and semantics of a description language for underspecified semantic representations. This concept is discussed in general and applied in particular to Predicate Logic and Discourse Representation Theory. The reason for exploring underspecified representations as suitable semantic representations for natural language expressions emerges directly from practical natural language processing applications. The so-called Combinatorial Explosion Puzzle, a well-known problem in this area, can successfully be tackled by using underspecified representations. The source of this problem, scopal ambiguities in natural language expressions, is discussed in section 2. The core of the paper presents Hole Semantics. This is a general proposal for a framework, in principle suitable for any logic, where underspecified representations play a central role. There is a clear separation between the object language (the logical language one is interested in) and the meta language (the language that describes and interprets underspecified structures). It has been noted by various authors that the meaning of an underspecified semantic representation cannot be expressed in terms of a disjunction of denotations, but rather as a set of denotations (cf. Poesio 1994). We support this view, and use it as the underlying principle for the definition of the semantic interpretation function of underspecified structures. Section 3 is an informal introduction to Hole Semantics, and in section 4 things are formally defined. In section 5 we apply Hole Semantics to Predicate Logic, resulting in an "unplugged" version of (static and dynamic) Predicate Logic. In section 6 we show that this idea easily carries over to Discourse Representation Structures. A lot of attention has been paid...
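
    A small, self-contained sketch of the underspecification idea: labelled formula fragments containing "holes", plus dominance constraints, where each admissible plugging (a bijection from holes to labels) yields one fully scoped reading. The example encodes the classic two readings of "Every man loves a woman". The representation details and helper names are illustrative only, not the formal definitions of the paper.

```python
from itertools import permutations

# Each labelled fragment lists the holes that occur inside it.
fragments = {
    "l1": {"holes": ["h1"], "text": "every(x, man(x), h1)"},
    "l2": {"holes": ["h2"], "text": "some(y, woman(y), h2)"},
    "l3": {"holes": [],     "text": "love(x, y)"},
}
holes = ["h0", "h1", "h2"]                 # h0 is the top hole
constraints = [("l1", "h0"), ("l2", "h0"), ("l3", "h1"), ("l3", "h2")]

def dominates(hole, label, plugging, seen=None):
    """True if `label` sits below `hole` once holes are plugged."""
    seen = set() if seen is None else seen
    if hole in seen:                       # guard against cyclic pluggings
        return False
    seen.add(hole)
    current = plugging[hole]
    if current == label:
        return True
    return any(dominates(h, label, plugging, seen)
               for h in fragments[current]["holes"])

def pluggings():
    """Enumerate hole-to-label bijections satisfying all constraints."""
    for labels in permutations(fragments):
        p = dict(zip(holes, labels))
        if all(dominates(h, l, p) for l, h in constraints):
            yield p

for reading in pluggings():
    print(reading)
# {'h0': 'l1', 'h1': 'l2', 'h2': 'l3'}   -> every > some
# {'h0': 'l2', 'h1': 'l3', 'h2': 'l1'}   -> some > every
```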

    Applying automated deduction to natural language understanding

    Very few natural language understanding applications employ methods from automated deduction. This is mainly because (i) a high level of interdisciplinary knowledge is required, (ii) there is a huge gap between formal semantic theory and practical implementation, and (iii) statistical rather than symbolic approaches dominate the current trends in natural language processing. Moreover, abduction rather than deduction is generally viewed as a promising way to apply reasoning in natural language understanding. We describe three applications where we show how first-order theorem proving and finite model construction can efficiently be employed in language understanding. The first is a text understanding system building semantic representations of texts, developed in the late 1990s. Theorem provers are here used to signal inconsistent interpretations and to check whether new contributions to the discourse are informative or not. This application shows that it is feasible to use general-purpose theorem provers for first-order logic, and that it pays off to use a battery of different inference engines, as in practice they complement each other in terms of performance. The second application is a spoken-dialogue interface to a mobile robot and an automated home. We use the first-order theorem prover SPASS for checking inconsistencies and newness of information, but the inference tasks are complemented with the finite model builder MACE, used in parallel with the prover. The model builder is used to check the satisfiability of the input; in addition, the produced finite and minimal models are used to determine the actions that the robot or automated house has to execute. When the semantic representation of the dialogue as well as the number of objects in the context are kept fairly small, response times are acceptable to human users. The third demonstration of successful use of first-order inference engines comes from the task of recognising entailment between two (short) texts. We run a robust parser producing semantic representations for both texts, and use the theorem prover Vampire to check whether one text entails the other. For many examples it is hard to compute the appropriate background knowledge in order to produce a proof, and the model builders MACE and Paradox are used to estimate the likelihood of an entailment.
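
    The two inference checks mentioned above (consistency and informativeness of a new contribution) can be illustrated with a toy propositional version: a contribution is consistent if discourse plus contribution has a model, and informative if the discourse alone does not already entail it. Real systems do this in first-order logic with provers such as Vampire and model builders such as MACE; the brute-force truth-table search below is only a self-contained sketch.

```python
from itertools import product

def models(formulas, atoms):
    """All truth assignments (dicts) satisfying every formula."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(f(v) for f in formulas):
            yield v

def consistent(discourse, new, atoms):
    """Discourse + new contribution has at least one model."""
    return any(True for _ in models(discourse + [new], atoms))

def informative(discourse, new, atoms):
    """Informative iff some model of the discourse falsifies the contribution."""
    return any(not new(v) for v in models(discourse, atoms))

if __name__ == "__main__":
    atoms = ["rain", "wet"]
    discourse = [lambda v: (not v["rain"]) or v["wet"],   # rain -> wet
                 lambda v: v["rain"]]                      # rain
    says_wet = lambda v: v["wet"]
    print(consistent(discourse, says_wet, atoms))    # True
    print(informative(discourse, says_wet, atoms))   # False: already entailed
```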

    Focusing particles & ellipsis resolution

    We present a semantic framework which integrates a compositional version of Discourse Representation Theory, Van der Sandt's presupposition theory, and a treatment of focus in the style of Rooth's Alternative Semantics. We discuss the semantics of focusing particles like too and only within this framework. The function of these particles is to maintain coherence in discourse or dialogue; this explicitly allows them to introduce contrast between phrases by means of presupposition. Of particular interest is the interaction between focusing particles and elliptical phrases; we pay special attention to cases of VP-ellipsis in English. It turns out that the interpretation of focusing particles naturally accounts for the occurrences of sloppy and strict readings in VP-ellipsis, because their presupposition adds contrast between the source and the target clause. This feature distinguishes the approach sketched in this paper from known approaches to ellipsis, which disregard the function of focusing particles.

    Separating Argument Structure from Logical Structure in AMR


    A semantically annotated corpus of tombstone inscriptions

    The digital preservation of funerary material is of interest to many different scientific disciplines. Textual information found on tombstones often goes far beyond the expected (name of the deceased, dates of birth and death), and may include information about commemorators, family roles, occupations, references to biblical or other texts, places of birth and death, cause of death, epitaphs and poems. Gravestones are multi-modal media, and besides text are often decorated with artistic symbols. To capture this information in a systematic way and make it available on a large scale for research purposes, a meaning representation based on linking entities by relations has been designed that will extend search capabilities beyond simple string matches. Concepts are represented as WordNet synsets, and a vocabulary of 32 relations makes connections between concepts. This formalisation has been developed and evaluated on a dataset of more than 1,000 Dutch tombstones.
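
    To illustrate the kind of entity-relation meaning representation described above, here is a small sketch in which concepts are WordNet synset identifiers and typed relations link them. The synset names, relation labels, person name, and date below are made-up examples; the actual 32-relation vocabulary is defined by the project.

```python
tombstone_graph = {
    "entities": {
        "e1": {"concept": "person.n.01", "name": "Jan de Vries"},   # hypothetical
        "e2": {"concept": "carpenter.n.01"},
        "e3": {"concept": "village.n.01", "name": "Winsum"},
    },
    "relations": [
        ("e1", "HasOccupation", "e2"),    # hypothetical relation label
        ("e1", "PlaceOfBirth", "e3"),     # hypothetical relation label
        ("e1", "DateOfDeath", "1873-04-12"),  # literal value, made up
    ],
}

def entities_with_relation(graph, relation):
    """Simple query beyond string matching: pairs linked by a given relation."""
    return [(s, o) for s, r, o in graph["relations"] if r == relation]

print(entities_with_relation(tombstone_graph, "PlaceOfBirth"))
# [('e1', 'e3')]
```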